Data specifics

  • The Longitudinal Employer-Household Dynamics (LEHD) program at the US Census Bureau releases the LEHD Origin-Destination Employment Statistics (LODES) datasets annually, based on employer-employee insurance records.
  • This data file draws on the LEHD Origin-Destination (OD) files. Each OD record pairs the census block where workers live with the census block where they work, which lets us compute an average commute distance from the distance between each home and workplace block pair.
  • Distance calculations: All distances are "as the crow flies" and calculated using the Vincenty ellipsoid method from the latitude and longitude of each census block's centroid. These distances are then aggregated to the census block group and tract levels.
  • Data presented here are from 2018, and spatial units are based on the 2010 census. As of July 2021, 2018 is the most recent year for which data are available; the earliest is 2002.
  • The data contain average commute distances for each spatial unit (SU), calculated for the following groups: (1) People who live and work in the Charlottesville region (avgc_workinRegion); (2) Charlottesville area residents who work within 150 miles of their home census block (avgc_within150); (3) Charlottesville area residents who work within the region or in a tract outside the Charlottesville area that employs at least 25 Charlottesville area residents (avgc_25_employees); (4) All Charlottesville area residents represented in the LODES OD data (avgc_all).
  • Some limitations: job counts do not include those working in defense-related industries; the data are prone to imperfect geocoding for certain jobs (jobs at companies with multiple branches are often all coded to the same location); although datasets exist for 2002-2018, these data are not suitable for longitudinal analysis; and student workers are unlikely to be represented because their jobs are not typically covered by state unemployment insurance.
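
To make the distance step concrete, here is a minimal base R sketch of Vincenty's inverse formula on the WGS84 ellipsoid for a single pair of points. It is an illustration only: the function name vincenty_m and the example coordinates are hypothetical, and the report does not say which implementation was actually used (the geosphere package's distVincentyEllipsoid() is one existing option). The function is scalar, so it would need mapply() or similar to run over many block pairs.

```r
# Ellipsoidal ("as the crow flies") distance in meters between two
# longitude/latitude points, using Vincenty's inverse formula (WGS84).
# Scalar inputs only; illustrative sketch, not the report's own code.
vincenty_m <- function(lon1, lat1, lon2, lat2,
                       a = 6378137, f = 1 / 298.257223563,
                       tol = 1e-12, max_iter = 200) {
  b <- (1 - f) * a                      # semi-minor axis
  to_rad <- pi / 180
  U1 <- atan((1 - f) * tan(lat1 * to_rad))   # reduced latitudes
  U2 <- atan((1 - f) * tan(lat2 * to_rad))
  L  <- (lon2 - lon1) * to_rad
  sinU1 <- sin(U1); cosU1 <- cos(U1)
  sinU2 <- sin(U2); cosU2 <- cos(U2)

  lambda <- L
  for (i in seq_len(max_iter)) {
    sin_sigma <- sqrt((cosU2 * sin(lambda))^2 +
                      (cosU1 * sinU2 - sinU1 * cosU2 * cos(lambda))^2)
    if (sin_sigma == 0) return(0)       # coincident points
    cos_sigma  <- sinU1 * sinU2 + cosU1 * cosU2 * cos(lambda)
    sigma      <- atan2(sin_sigma, cos_sigma)
    sin_alpha  <- cosU1 * cosU2 * sin(lambda) / sin_sigma
    cos2_alpha <- 1 - sin_alpha^2
    # cos(2 * sigma_m); defined as 0 for lines along the equator
    cos_2sm <- if (cos2_alpha == 0) 0 else
      cos_sigma - 2 * sinU1 * sinU2 / cos2_alpha
    C <- f / 16 * cos2_alpha * (4 + f * (4 - 3 * cos2_alpha))
    lambda_new <- L + (1 - C) * f * sin_alpha *
      (sigma + C * sin_sigma *
         (cos_2sm + C * cos_sigma * (-1 + 2 * cos_2sm^2)))
    converged <- abs(lambda_new - lambda) < tol
    lambda <- lambda_new
    if (converged) break
  }

  u2 <- cos2_alpha * (a^2 - b^2) / b^2
  A  <- 1 + u2 / 16384 * (4096 + u2 * (-768 + u2 * (320 - 175 * u2)))
  B  <- u2 / 1024 * (256 + u2 * (-128 + u2 * (74 - 47 * u2)))
  d_sigma <- B * sin_sigma * (cos_2sm + B / 4 *
    (cos_sigma * (-1 + 2 * cos_2sm^2) -
       B / 6 * cos_2sm * (-3 + 4 * sin_sigma^2) * (-3 + 4 * cos_2sm^2)))
  b * A * (sigma - d_sigma)
}

# Hypothetical home and work centroids (lon, lat); meters -> miles.
meters_per_mile <- 1609.344
dist_mi <- vincenty_m(-78.4767, 38.0293, -77.4360, 37.5407) / meters_per_mile
```

Dividing by 1609.344 converts the result to miles, the unit used for the averages in this file.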

Variable descriptions

meta %>% 
  filter(su_blkgp == 1) %>%
  select(varname, about) %>% as.list()
## $varname
## [1] "avgc_all"          "avgc_within150"    "avgc_25_employees"
## [4] "avgc_workinRegion" "blkgroup"          "county"           
## 
## $about
## [1] "Average \"as the crow flies\" commuting distance for all workers in the SU"                                                                                                                                                                          
## [2] "Average \"as the crow flies\" commuting distance for workers in the SU who commute within 150 miles"                                                                                                                                                 
## [3] "Average \"as the crow flies\" commuting distance for workers in the SU who commute within the region of interest and those who commute to a census tract outside the region of interest that employs at least 25 residents of the region of interest"
## [4] "Average \"as the crow flies\" commuting distance for workers in the SU who work in the same region"                                                                                                                                                  
## [5] "12-digit census block group code"                                                                                                                                                                                                                    
## [6] "5-digit county code"
glimpse(lodes)
## Rows: 155
## Columns: 6
## $ blkgroup          <dbl> 510030101001, 510030101002, 510030101003, 5100301020…
## $ avgc_workinRegion <dbl> 5.402613, 11.199065, 7.467619, 6.103517, 4.534515, 3…
## $ avgc_within150    <dbl> 24.98645, 30.40102, 23.19736, 19.56142, 18.36815, 21…
## $ avgc_25_employees <dbl> 17.85653, 26.41289, 20.48374, 17.45696, 14.67376, 19…
## $ avgc_all          <dbl> 28.94010, 32.00603, 27.38450, 21.26024, 18.93745, 24…
## $ county            <int> 51003, 51003, 51003, 51003, 51003, 51003, 51003, 510…
lodes %>% select(avgc_all, avgc_within150, avgc_25_employees, avgc_workinRegion) %>% 
  select(where(~is.numeric(.x))) %>% 
  as.data.frame() %>% 
  stargazer(., type = "text", title = "Summary Statistics", digits = 2,
            summary.stat = c("mean", "sd", "min", "median", "max"))
## 
## Summary Statistics
## ===================================================
## Statistic         Mean  St. Dev.  Min  Median  Max 
## ---------------------------------------------------
## avgc_all          27.51   8.94   14.98 25.51  77.57
## avgc_within150    24.58   7.80   13.63 23.20  68.15
## avgc_25_employees 21.51   7.55   10.66 20.31  62.82
## avgc_workinRegion 7.93    5.81   1.51   5.58  25.47
## ---------------------------------------------------
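
The sketch below shows one way the four measures could be derived from block-level OD rows, using a toy table. The column names h_geocode, w_geocode, and S000 follow the LODES OD layout; everything else is an assumption made for illustration: the geocodes and distances are made up, region_counties is a hypothetical stand-in for the region definition, "within 150 miles" is read as less than or equal to 150, and weighting each pair by its job count is a guess, since the notes above do not say how pairs are weighted.

```r
# Toy OD rows: home block, work block, job count, distance in miles.
# All values are invented for illustration.
od <- data.frame(
  h_geocode = c("510030101001000", "510030101001000", "510030101001001",
                "510030101002000", "510030101002000", "510030101002000"),
  w_geocode = c("515400002011000", "517600102001000", "370630001001000",
                "515400002011000", "517600102001000", "516800001001000"),
  S000    = c(10, 20, 1, 4, 10, 2),
  dist_mi = c(5, 65, 170, 10, 70, 60),
  stringsAsFactors = FALSE
)
region_counties <- c("51003", "51540")   # hypothetical region definition

# A 15-digit block code nests the county (5), tract (11), and
# block group (12) codes as prefixes.
od$blkgroup  <- substr(od$h_geocode, 1, 12)
od$w_county  <- substr(od$w_geocode, 1, 5)
od$w_tract   <- substr(od$w_geocode, 1, 11)
od$in_region <- od$w_county %in% region_counties

# Jobs held by region residents in each workplace tract.
tract_jobs   <- tapply(od$S000, od$w_tract, sum)
od$big_tract <- as.vector(tract_jobs[od$w_tract]) >= 25

# Job-weighted mean distance per home block group (assumed weighting).
weighted_avg <- function(d) {
  num <- tapply(d$dist_mi * d$S000, d$blkgroup, sum)
  den <- tapply(d$S000, d$blkgroup, sum)
  out <- num / den
  setNames(as.numeric(out), names(out))
}

# Row filters for the four measures.
groups <- list(
  avgc_all          = rep(TRUE, nrow(od)),
  avgc_within150    = od$dist_mi <= 150,
  avgc_25_employees = od$in_region | od$big_tract,
  avgc_workinRegion = od$in_region
)
blkgroups <- sort(unique(od$blkgroup))
result <- sapply(groups, function(keep) weighted_avg(od[keep, ])[blkgroups])
round(result, 2)
```

In this toy table the single long-distance row only affects avgc_all for the first block group, while the small out-of-region tract only affects avgc_25_employees for the second, which mirrors why the four averages in the summary table shrink as the filter tightens.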

Visual distribution

lodes %>% select(c(blkgroup, avgc_all, avgc_within150, avgc_25_employees, avgc_workinRegion)) %>% 
  pivot_longer(-blkgroup, names_to = "measure", values_to = "value") %>% 
  ggplot(aes(x = value, fill = measure)) + 
  scale_fill_viridis(option = "plasma", discrete = TRUE, guide = "none") +
  geom_histogram() + 
  facet_wrap(~measure, scales = "free")
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
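
The message above comes from leaving the bin count at its default. As one possible way to pick a width, the sketch below applies the Freedman-Diaconis rule (twice the interquartile range divided by the cube root of the sample size). The helper name fd_binwidth and the sample vector are hypothetical, and this rule is only one of several reasonable choices.

```r
# Freedman-Diaconis bin width: 2 * IQR / n^(1/3), ignoring missing values.
fd_binwidth <- function(x) {
  x <- x[!is.na(x)]
  2 * IQR(x) / length(x)^(1 / 3)
}

# Made-up commute distances, used only to exercise the helper.
example_distances <- c(5.4, 11.2, 7.5, 6.1, 4.5, 24.9, 30.4, 23.2)
bw <- fd_binwidth(example_distances)
```

Because the four measures have different spreads, a width computed for one of them may be too coarse or too fine for another. geom_histogram() accepts either a number or a function for its binwidth argument, so passing the helper itself is an option worth trying with the faceted plot.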

meta %>% 
  filter(varname %in% c("avgc_all", "avgc_within150", "avgc_25_employees", "avgc_workinRegion")) %>%
  mutate(label = paste0(varname, ": ", about)) %>% 
  select(label) %>% 
  as.list()

## $label
## [1] "avgc_all: Average \"as the crow flies\" commuting distance for all workers in the SU"
## [2] "avgc_within150: Average \"as the crow flies\" commuting distance for workers in the SU who commute within 150 miles"
## [3] "avgc_25_employees: Average \"as the crow flies\" commuting distance for workers in the SU who commute within the region of interest and those who commute to a census tract outside the region of interest that employs at least 25 residents of the region of interest"
## [4] "avgc_workinRegion: Average \"as the crow flies\" commuting distance for workers in the SU who work in the same region"

Mapping the data

All Charlottesville region workers

pal <- colorNumeric("plasma", reverse = T, domain = cvl_lodes$avgc_all)
leaflet(cvl_lodes) %>% 
  addProviderTiles("CartoDB.Positron") %>% 
  addPolygons(data = cvl_lodes,
              fillColor = ~pal(avgc_all),
              weight = 1,
              opacity = 1,
              color = "white", 
              fillOpacity = 0.6,
              highlight = highlightOptions(
                weight = 1, fillOpacity = 0.8, bringToFront = T
              ),
              popup = paste0("GEOID: ", cvl_lodes$blkgroup, "<br>",
                             "Average commute (mi): ", round(cvl_lodes$avgc_all, 2))) %>% 
  addLegend("bottomright", pal = pal, values = cvl_lodes$avgc_all, 
            title = "Average commute (mi)", opacity = 0.7)
meta %>% 
  filter(varname %in% c("avgc_all")) %>%
  mutate(label = paste0(varname, ": ", about)) %>% 
  select(label) %>% 
  as.list()

## $label
## [1] "avgc_all: Average \"as the crow flies\" commuting distance for all workers in the SU"

Charlottesville region workers who work within 150 miles of home

pal <- colorNumeric("plasma", reverse = T, domain = cvl_lodes$avgc_within150)
leaflet(cvl_lodes) %>% 
  addProviderTiles("CartoDB.Positron") %>% 
  addPolygons(data = cvl_lodes,
              fillColor = ~pal(avgc_within150),
              weight = 1,
              opacity = 1,
              color = "white", 
              fillOpacity = 0.6,
              highlight = highlightOptions(
                weight = 1, fillOpacity = 0.8, bringToFront = T
              ),
              popup = paste0("GEOID: ", cvl_lodes$blkgroup, "<br>",
                             "Average commute (mi): ", round(cvl_lodes$avgc_within150, 2))) %>% 
  addLegend("bottomright", pal = pal, values = cvl_lodes$avgc_within150, 
            title = "Average commute (mi)", opacity = 0.7)
meta %>% 
  filter(varname %in% c("avgc_within150")) %>%
  mutate(label = paste0(varname, ": ", about)) %>% 
  select(label) %>% 
  as.list()

## $label
## [1] "avgc_within150: Average \"as the crow flies\" commuting distance for workers in the SU who commute within 150 miles"

Charlottesville region workers who work in a tract that employs >=25 Charlottesville region residents

pal <- colorNumeric("plasma", reverse = T, domain = cvl_lodes$avgc_25_employees)
leaflet(cvl_lodes) %>% 
  addProviderTiles("CartoDB.Positron") %>% 
  addPolygons(data = cvl_lodes,
              fillColor = ~pal(avgc_25_employees),
              weight = 1,
              opacity = 1,
              color = "white", 
              fillOpacity = 0.6,
              highlight = highlightOptions(
                weight = 1, fillOpacity = 0.8, bringToFront = T
              ),
              popup = paste0("GEOID: ", cvl_lodes$blkgroup, "<br>",
                             "Average commute (mi): ", round(cvl_lodes$avgc_25_employees, 2))) %>% 
  addLegend("bottomright", pal = pal, values = cvl_lodes$avgc_25_employees, 
            title = "Average commute (mi)", opacity = 0.7)
meta %>% 
  filter(varname %in% c("avgc_25_employees")) %>%
  mutate(label = paste0(varname, ": ", about)) %>% 
  select(label) %>% 
  as.list()

## $label
## [1] "avgc_25_employees: Average \"as the crow flies\" commuting distance for workers in the SU who commute within the region of interest and those who commute to a census tract outside the region of interest that employs at least 25 residents of the region of interest"

Charlottesville region only

pal <- colorNumeric("plasma", reverse = T, domain = cvl_lodes$avgc_workinRegion)
leaflet(cvl_lodes) %>% 
  addProviderTiles("CartoDB.Positron") %>% 
  addPolygons(data = cvl_lodes,
              fillColor = ~pal(avgc_workinRegion),
              weight = 1,
              opacity = 1,
              color = "white", 
              fillOpacity = 0.6,
              highlight = highlightOptions(
                weight = 1, fillOpacity = 0.8, bringToFront = T
              ),
              popup = paste0("GEOID: ", cvl_lodes$blkgroup, "<br>",
                             "Average commute (mi): ", round(cvl_lodes$avgc_workinRegion, 2))) %>% 
  addLegend("bottomright", pal = pal, values = cvl_lodes$avgc_workinRegion, 
            title = "Average commute (mi)", opacity = 0.7)
meta %>% 
  filter(varname %in% c("avgc_workinRegion")) %>%
  mutate(label = paste0(varname, ": ", about)) %>% 
  select(label) %>% 
  as.list()

## $label
## [1] "avgc_workinRegion: Average \"as the crow flies\" commuting distance for workers in the SU who work in the same region"
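
The four maps above differ only in the column being mapped, so the repeated code could be folded into a helper. The sketch below is one way to do that; commute_popup and commute_map are hypothetical names, and it assumes that the leaflet package is installed and that data is a polygon layer with a blkgroup column, like cvl_lodes.

```r
# Build the popup HTML shown when a polygon is clicked (base R only).
commute_popup <- function(geoid, value) {
  paste0("GEOID: ", geoid, "<br>",
         "Average commute (mi): ", round(value, 2))
}

# Draw one choropleth for the column named in `var`.
# Uses leaflet:: prefixes so nothing needs to be attached beforehand.
commute_map <- function(data, var, title = "Average commute (mi)") {
  values <- data[[var]]
  pal <- leaflet::colorNumeric("plasma", domain = values, reverse = TRUE)
  m <- leaflet::leaflet(data)
  m <- leaflet::addProviderTiles(m, "CartoDB.Positron")
  m <- leaflet::addPolygons(
    m,
    fillColor = pal(values),
    weight = 1,
    opacity = 1,
    color = "white",
    fillOpacity = 0.6,
    highlightOptions = leaflet::highlightOptions(
      weight = 1, fillOpacity = 0.8, bringToFront = TRUE
    ),
    popup = commute_popup(data$blkgroup, values)
  )
  leaflet::addLegend(m, "bottomright", pal = pal, values = values,
                     title = title, opacity = 0.7)
}

# Example call (not run here): commute_map(cvl_lodes, "avgc_within150")
```

Passing the column name as a string keeps each map call to a single line and makes it harder for the fill color, popup, and legend to drift out of sync.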